AITopics

2501.18084

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Africa > Middle East > Tunisia > Ben Arous Governorate > Ben Arous (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.67)
Health & Medicine > Therapeutic Area (0.46)
Government > Regional Government > North America Government > United States Government (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Morel-Balbi, Sebastian, Kirkley, Alec

Learning when to rank: Estimation of partial rankings from sparse, noisy comparisons

arXiv.org Machine Learning Jan-5-2025

A common task arising in various domains is that of ranking items based on the outcomes of pairwise comparisons, from ranking players and teams in sports to ranking products or brands in marketing studies and recommendation systems. Statistical inference-based methods such as the Bradley-Terry model, which extract rankings based on an underlying generative model of the comparison outcomes, have emerged as flexible and powerful tools to tackle the task of ranking in empirical data. In situations with limited and/or noisy comparisons, it is often challenging to confidently distinguish the performance of different items based on the evidence available in the data. However, existing inference-based ranking methods overwhelmingly choose to assign each item to a unique rank or score, suggesting a meaningful distinction when there is none. Here, we address this problem by developing a principled Bayesian methodology for learning partial rankings -- rankings with ties -- that distinguishes among the ranks of different items only when there is sufficient evidence available in the data. Our framework is adaptable to any statistical ranking method in which the outcomes of pairwise observations depend on the ranks or scores of the items being compared. We develop a fast agglomerative algorithm to perform Maximum A Posteriori (MAP) inference of partial rankings under our framework and examine the performance of our method on a variety of real and synthetic network datasets, finding that it frequently gives a more parsimonious summary of the data than traditional ranking, particularly when observations are sparse.
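For readers unfamiliar with the Bradley-Terry model named in the abstract, here is a minimal sketch of its standard win-probability formula. The item names and scores are hypothetical, chosen only for illustration, and the sketch does not implement the authors' Bayesian partial-ranking method. It only shows why equal scores leave two items indistinguishable.

```python
def bt_win_probability(score_i: float, score_j: float) -> float:
    """Bradley-Terry probability that item i beats item j, given positive scores."""
    if score_i <= 0 or score_j <= 0:
        raise ValueError("scores must be positive")
    return score_i / (score_i + score_j)


# Hypothetical scores, not taken from the paper.
scores = {"A": 3.0, "B": 1.0, "C": 1.0}

p_ab = bt_win_probability(scores["A"], scores["B"])  # A is favoured over B
p_bc = bt_win_probability(scores["B"], scores["C"])  # B and C are a coin flip

print(f"P(A beats B) = {p_ab:.2f}")
print(f"P(B beats C) = {p_bc:.2f}")
```

When two items have the same score, as B and C do here, the model predicts an even contest, which is the situation a ranking with ties can represent directly.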

artificial intelligence, machine learning, ranking, (17 more...)

2501.02505

Country: North America > United States (0.93)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment > Sports > Soccer (0.68)
Education (0.67)
Leisure & Entertainment > Games > Chess (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Sarkar, Sreetama, Kundu, Souvik, Beerel, Peter A.

Linearizing Models for Efficient yet Robust Private Inference

arXiv.org Artificial Intelligence Feb-8-2024

The growing concern about data privacy has led to the development of private inference (PI) frameworks in client-server applications that protect both data privacy and model IP. However, the cryptographic primitives required yield significant latency overhead, which limits their widespread application. At the same time, changing environments demand that the PI service be robust against various naturally occurring and gradient-based perturbations. Despite several works focused on the development of latency-efficient models suitable for PI, the impact of these models on robustness has remained unexplored. Towards this goal, this paper presents RLNet, a class of robust linearized networks that can yield latency improvement via reduction of high-latency ReLU operations while improving the model performance on both clean and corrupted images. In particular, RLNet models provide a "triple win ticket" of improved classification accuracy on clean, naturally perturbed, and gradient-based perturbed images using a shared-mask shared-weight architecture with over an order of magnitude fewer ReLUs than baseline models. To demonstrate the efficacy of RLNet, we perform extensive experiments with ResNet and WRN model variants on CIFAR-10, CIFAR-100, and Tiny-ImageNet datasets. Our experimental evaluations show that RLNet can yield models with up to 11.14x fewer ReLUs, with accuracy close to the all-ReLU models, on clean, naturally perturbed, and gradient-based perturbed images. Compared with the SoTA non-robust linearized models at similar ReLU budgets, RLNet achieves an improvement in adversarial accuracy of up to ~47% and in naturally perturbed accuracy of up to ~16.4%, while improving clean image accuracy by up to ~1.5%.

accuracy, adversarial image, pr model, (15 more...)

2402.05521

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.40)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Artificial Intelligence Jul-24-2023

Boosting Punctuation Restoration with Data Generation and Reinforcement Learning

Lai, Viet Dac, Salinas, Abel, Tan, Hao, Bui, Trung, Tran, Quan, Yoon, Seunghyun, Deilamsalehy, Hanieh, Dernoncourt, Franck, Nguyen, Thien Huu

Punctuation restoration is an important task in automatic speech recognition (ASR) which aims to restore the syntactic structure of generated ASR texts to improve readability. While punctuated texts are abundant in written documents, the discrepancy between written punctuated texts and ASR texts limits the usability of written texts in training punctuation restoration systems for ASR texts. This paper proposes a reinforcement learning method to exploit in-topic written texts and recent advances in large pre-trained generative language models to bridge this gap. The experiments show that our method achieves state-of-the-art performance on the ASR test set on two benchmark datasets for punctuation restoration.

machine learning, pr model, reinforcement learning, (18 more...)

2307.12949

Country:

North America > United States > California (0.14)
Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
North America > United States > Oregon (0.04)

Genre: Research Report (0.50)

Industry: Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.86)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.53)

arXiv.org Artificial Intelligence Apr-26-2023

Making Models Shallow Again: Jointly Learning to Reduce Non-Linearity and Depth for Latency-Efficient Private Inference

Kundu, Souvik, Zhang, Yuke, Chen, Dake, Beerel, Peter A.

The large number of ReLU and MAC operations in deep neural networks makes them ill-suited for latency- and compute-efficient private inference. In this paper, we present a model optimization method that allows a model to learn to be shallow. In particular, we leverage the ReLU sensitivity of a convolutional block to remove a ReLU layer and merge its succeeding and preceding convolution layers into a shallow block. Unlike existing ReLU reduction methods, our joint reduction method can yield models with improved reduction of both ReLUs and linear operations, by up to 1.73x and 1.47x, respectively, evaluated with ResNet18 on CIFAR-100 without any significant accuracy drop.
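The merging step described in the abstract rests on a simple fact: once the non-linearity between two linear layers is removed, their composition is itself a single linear layer. The sketch below illustrates this with small dense matrices standing in for convolutions. The weights and input are hypothetical, and this is not the authors' procedure, which also decides which ReLUs to remove.

```python
def matmul(a, b):
    """Multiply matrix a (m x k) by matrix b (k x n), both as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def matvec(w, x):
    """Apply the linear layer w to the input vector x."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


# Hypothetical weights: layer 1 maps 2 -> 3 features, layer 2 maps 3 -> 2.
w1 = [[1.0, 2.0], [0.0, 1.0], [3.0, -1.0]]
w2 = [[1.0, 0.0, 2.0], [-1.0, 1.0, 0.0]]
x = [0.5, -2.0]

# Two layers applied in sequence, with no ReLU in between.
two_step = matvec(w2, matvec(w1, x))

# The same computation as one merged, shallower layer.
merged = matmul(w2, w1)
one_step = matvec(merged, x)

print(two_step, one_step)
```

The merged layer here holds 4 weights instead of 12, which hints at how merging can also reduce linear operations; whether it does in a real network depends on the layer shapes.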

artificial intelligence, machine learning, reduction, (17 more...)

2304.13274

Country:

North America > United States > California > San Diego County > San Diego (0.04)
North America > United States > California > Los Angeles County > Los Angeles (0.04)
Asia (0.04)

Genre: Research Report (0.40)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Li, Chen, Tsourdos, Antonios, Guo, Weisi

A Transistor Operations Model for Deep Learning Energy Consumption Scaling Law

arXiv.org Artificial Intelligence Aug-9-2022

Deep Learning (DL) has transformed the automation of a wide range of industries and finds increasing ubiquity in society. The high complexity of DL models and their widespread adoption have led to global energy consumption doubling every 3-4 months. Currently, the relationship between the DL model configuration and energy consumption is not well established. At a general computational energy model level, there is a strong dependency on both the hardware architecture (e.g., generic processors with different configurations of inner components, such as CPUs and GPUs, and programmable integrated circuits, such as FPGAs) and the different interacting energy consumption aspects (e.g., data movement, calculation, control). At the DL model level, we need to translate non-linear activation functions and their interaction with data into calculation tasks. Current methods mainly linearize nonlinear DL models to approximate their theoretical FLOPs and MACs as a proxy for energy consumption. Yet, this is inaccurate (est. 93% accuracy) due to, for example, the highly nonlinear nature of many convolutional neural networks (CNNs). In this paper, we develop a bottom-level Transistor Operations (TOs) method to expose the role of non-linear activation functions and neural network structure in energy consumption. We translate a range of feedforward and CNN models into ALU calculation tasks and then TO steps. These are then statistically linked to real energy consumption values via a regression model for different hardware configurations and data sets. We show that our proposed TOs method can achieve 93.61%-99.51% precision in predicting energy consumption.

consumption, energy consumption, operation, (16 more...)

2205.15062

Country:

Europe > United Kingdom > England > Greater London > London (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > India (0.04)

Genre: Research Report (0.40)

Industry: Energy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Godahewa, Rakshitha, Bandara, Kasun, Webb, Geoffrey I., Smyl, Slawek, Bergmeir, Christoph

Ensembles of Localised Models for Time Series Forecasting

arXiv.org Machine Learning Dec-30-2020

With large quantities of data typically available nowadays, forecasting models that are trained across sets of time series, known as Global Forecasting Models (GFM), are regularly outperforming traditional univariate forecasting models that work on isolated series. As GFMs usually share the same set of parameters across all time series, they often have the problem of not being localised enough to a particular series, especially in situations where datasets are heterogeneous. We study how ensembling techniques can be used with generic GFMs and univariate models to solve this issue. Our work systematises and compares relevant current approaches, namely clustering series and training separate submodels per cluster, the so-called ensemble of specialists approach, and building heterogeneous ensembles of global and local models. We fill some gaps in the approaches and generalise them to different underlying GFM model types. We then propose a new methodology of clustered ensembles where we train multiple GFMs on different clusters of series, obtained by changing the number of clusters and cluster seeds. Using Feed-forward Neural Networks, Recurrent Neural Networks, and Pooled Regression models as the underlying GFMs, in our evaluation on six publicly available datasets, the proposed models are able to achieve significantly higher accuracy than baseline GFM models and univariate forecasting methods.

dataset, gfm, time series, (11 more...)

2012.15059

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Austria > Vienna (0.14)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.06)
(3 more...)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.46)

Industry: Energy > Renewable (0.46)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Hewamalage, Hansika, Bergmeir, Christoph, Bandara, Kasun

Global Models for Time Series Forecasting: A Simulation Study

arXiv.org Machine Learning Dec-22-2020

In the current context of Big Data, the nature of many forecasting problems has changed from predicting isolated time series to predicting many time series from similar sources. This has opened up the opportunity to develop competitive global forecasting models that simultaneously learn from many time series. However, it remains unclear when global forecasting models can outperform the univariate benchmarks, especially along the dimensions of the homogeneity/heterogeneity of series, the complexity of patterns in the series, the complexity of forecasting models, and the lengths/number of series. Our study attempts to address this problem by investigating the effect of these factors, simulating a number of datasets that have controllable time series characteristics. Specifically, we simulate time series from simple data generating processes (DGPs), such as Auto Regressive (AR) and Seasonal AR, to complex DGPs, such as Chaotic Logistic Map, Self-Exciting Threshold Auto-Regressive, and Mackey-Glass Equations. Data heterogeneity is introduced by mixing time series generated from several DGPs into a single dataset. The lengths and the number of series in the dataset are varied in different scenarios. We perform experiments on these datasets using global forecasting models including Recurrent Neural Networks (RNN), Feed-Forward Neural Networks, Pooled Regression (PR) models and Light Gradient Boosting Models (LGBM), and compare their performance against standard statistical univariate forecasting techniques. Our experiments demonstrate that when trained as global forecasting models, techniques such as RNNs and LGBMs, which have complex non-linear modelling capabilities, are competitive methods in general under challenging forecasting scenarios such as series with short lengths, datasets with heterogeneous series, and minimal prior knowledge of the patterns of the series.
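To make the simulation setup concrete, the following is a rough sketch of two of the data generating processes named in the abstract, an AR(1) process and the chaotic logistic map, mixed into one heterogeneous dataset. All parameter values (`phi`, `sigma`, `r`, the starting points, series lengths and counts) are assumptions for illustration and are not the settings used in the study.

```python
import random


def simulate_ar1(n, phi=0.7, sigma=1.0, seed=0):
    """AR(1) series: x_t = phi * x_{t-1} + Gaussian noise, starting from 0."""
    rng = random.Random(seed)
    series, prev = [], 0.0
    for _ in range(n):
        prev = phi * prev + rng.gauss(0.0, sigma)
        series.append(prev)
    return series


def simulate_logistic_map(n, r=4.0, x0=0.2):
    """Logistic map: x_t = r * x_{t-1} * (1 - x_{t-1}); chaotic at r = 4."""
    series, x = [], x0
    for _ in range(n):
        x = r * x * (1.0 - x)
        series.append(x)
    return series


# Heterogeneous dataset: series from both processes placed side by side.
ar_series = [simulate_ar1(50, seed=s) for s in range(3)]
chaotic_series = [simulate_logistic_map(50, x0=0.1 + 0.1 * s) for s in range(3)]
dataset = ar_series + chaotic_series

print(len(dataset), "series of length", len(dataset[0]))
```

Varying the series length, the number of series per process, and the mix of processes corresponds to the different scenarios the abstract describes.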

dgp, scenario, time series, (16 more...)

2012.12485

Country:

Europe > Austria > Vienna (0.14)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.05)
Oceania > Australia > Victoria > Melbourne (0.04)
(5 more...)

Genre:

Research Report > New Finding (0.94)
Research Report > Experimental Study (0.68)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.93)